# Scatterplot with a separate quadratic fit (and confidence band) for each sex
ggplot(ElephantsMF, aes(x = Age, y = Height, color = Sex)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = TRUE) +
  xlab("Age (Years)") + ylab("Shoulder Height (cm)")
Both are acceptable and can capture the non-linear relationship between height and age, but a quadratic will eventually “bend” (up or down) in both directions.
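A minimal sketch of this "bending" behavior, using simulated data rather than ElephantsMF (the growth curve, noise level, and object names below are assumptions made only for illustration). A fitted quadratic \(b_0 + b_1 x + b_2 x^2\) turns around at \(x = -b_1/(2 b_2)\), so predictions beyond that age move in the opposite direction even if the true relationship levels off.

```r
# Simulated (hypothetical) data: height rises quickly, then levels off with age
set.seed(1)
age <- runif(200, 0, 33)
height <- 80 + 250 * (1 - exp(-0.15 * age)) + rnorm(200, sd = 10)

# Quadratic fit using raw polynomial terms so the coefficients are easy to read
fit_quad <- lm(height ~ age + I(age^2))
b <- coef(fit_quad)

# Age at which the fitted parabola turns around: -b1 / (2 * b2)
turning_age <- unname(-b["age"] / (2 * b["I(age^2)"]))
print(turning_age)

# Predicted height at the turning point and 20 years past it
new_ages <- data.frame(age = c(turning_age, turning_age + 20))
pred <- predict(fit_quad, newdata = new_ages)
print(pred)
```

Because the simulated curve is concave, the fitted quadratic coefficient is negative and the second prediction falls below the first, even though the simulated heights never decline with age.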
Mapping color to Sex results in an interaction model being plotted (a separate curve is fit for each sex):
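# Prediction grid: every combination of sex and age from 0 to 33 years
newdata <- expand.grid(Sex = c("M", "F"),
                       Age = seq(0, 33, by = 1))
# lm.ele is the fitted model object; its definition is not shown in this section
newdata$phat <- predict(lm.ele, newdata = newdata)
# Observed data with the model-based predictions overlaid as dashed lines
ggplot(ElephantsMF, aes(x = Age, y = Height, color = Sex)) +
  geom_point() +
  geom_line(data = newdata, aes(Age, phat, col = Sex), lty = 2, lwd = 2) +
  xlab("Age (Years)") + ylab("Shoulder Height (cm)") +
  theme_bw()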
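To see why per-group smooths correspond to an interaction model, the following sketch (simulated data and hypothetical object names, not the ElephantsMF fit) compares a single model with a full Age-by-Sex interaction to two models fit separately to each sex. Under these assumptions, the two approaches should give the same predicted curves up to numerical error.

```r
# Simulated (hypothetical) data with a different quadratic curve for each sex
set.seed(2)
n <- 100
dat <- data.frame(
  Age = runif(2 * n, 0, 33),
  Sex = factor(rep(c("F", "M"), each = n))
)
mean_height <- ifelse(dat$Sex == "M",
                      100 + 12 * dat$Age - 0.20 * dat$Age^2,
                      100 +  9 * dat$Age - 0.15 * dat$Age^2)
dat$Height <- mean_height + rnorm(2 * n, sd = 8)

# One model with a full interaction between the quadratic in Age and Sex
fit_int <- lm(Height ~ poly(Age, 2, raw = TRUE) * Sex, data = dat)

# Two separate quadratic models, one per sex
fit_F <- lm(Height ~ poly(Age, 2, raw = TRUE), data = subset(dat, Sex == "F"))
fit_M <- lm(Height ~ poly(Age, 2, raw = TRUE), data = subset(dat, Sex == "M"))

# Compare predictions on a common grid of ages for both sexes
grid <- expand.grid(Age = seq(0, 33, by = 1), Sex = c("F", "M"))
pred_int <- predict(fit_int, newdata = grid)
pred_sep <- ifelse(grid$Sex == "F",
                   predict(fit_F, newdata = grid),
                   predict(fit_M, newdata = grid))

# Largest absolute difference between the two sets of predictions
max_diff <- max(abs(pred_int - pred_sep))
print(max_diff)
```

The confidence bands can still differ between the two approaches, because the interaction model pools the residual variance across sexes while separate fits estimate it within each group.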
Could in principle compare models (e.g., using AIC) that have varying numbers of knots, or different knot locations
Choose a small number of knots (df), based on how much data you have and how complex you expect the relationship to be a priori
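As a rough illustration of comparing knot choices, the sketch below fits natural cubic splines with `splines::ns()` for several values of `df` and tabulates AIC. The data are simulated and the range of `df` values is an arbitrary assumption; with real data the candidate set should be chosen a priori, as described above.

```r
library(splines)

# Simulated (hypothetical) data: a response that levels off with age
set.seed(3)
dat <- data.frame(age = runif(300, 0, 33))
dat$height <- 80 + 250 * (1 - exp(-0.15 * dat$age)) + rnorm(300, sd = 10)

# Candidate spline complexities; ns() places df - 1 interior knots at quantiles
dfs <- 2:6
fits <- lapply(dfs, function(k) lm(height ~ ns(age, df = k), data = dat))

# AIC for each candidate, sorted so the smallest AIC comes first
aic_table <- data.frame(df = dfs, AIC = sapply(fits, AIC))
aic_table <- aic_table[order(aic_table$AIC), ]
print(aic_table)
```

Specific knot locations could be compared the same way by passing `knots = c(...)` to `ns()` instead of `df`.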
Linear regression:
\[TF_i \sim N(\mu_i, \sigma^2)\] \[\mu_i = \beta_0 + \beta_1DBH_i\]
Minimizes: \(\sum_i (TF_i - \beta_0 - \beta_1DBH_i)^2\)
GLS varPower model:
\[TF_i \sim N(\mu_i, \sigma_i^2)\] \[\mu_i = \beta_0 + \beta_1DBH_i\] \[\sigma_i^2 = \sigma^2|DBH_i|^{2\delta}\]
Minimizes: \(\sum_i \frac{(TF_i - \beta_0 - \beta_1DBH_i)^2}{\sigma^2|DBH_i|^{2\delta}}\)
| | linear model | varPower |
|---|---|---|
| (Intercept) | 0.196 (0.280) | 0.028 (0.113) |
| DBH | 0.384 (0.013) | 0.393 (0.010) |
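A comparison like the one in the table above could be produced along the following lines. This is only a sketch: the data are simulated (the true coefficients, `delta_true`, and the data frame name `trees` are assumptions), so the numbers will not match the table. It uses `nlme::gls()` with `weights = varPower(form = ~ DBH)`, which models the residual variance as \(\sigma^2|DBH_i|^{2\delta}\).

```r
library(nlme)

# Simulated (hypothetical) data whose residual SD grows with DBH
set.seed(4)
n <- 200
DBH <- runif(n, 5, 60)
delta_true <- 0.8
TF <- 0.1 + 0.4 * DBH + rnorm(n, sd = 0.05 * DBH^delta_true)
trees <- data.frame(DBH = DBH, TF = TF)

# Ordinary linear regression: constant residual variance
fit_lm <- lm(TF ~ DBH, data = trees)

# GLS with residual variance sigma^2 * |DBH|^(2 * delta)
fit_gls <- gls(TF ~ DBH, weights = varPower(form = ~ DBH), data = trees)

# Coefficient estimates and standard errors from the two fits
print(summary(fit_lm)$coefficients[, 1:2])
print(summary(fit_gls)$tTable[, 1:2])

# Estimated power parameter (delta) of the variance function
delta_hat <- coef(fit_gls$modelStruct$varStruct, unconstrained = FALSE)
print(delta_hat)
```

Both fits estimate the same mean structure; the GLS fit down-weights observations with large DBH, which typically changes the standard errors more than the point estimates.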